Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system
Abstract
This paper investigates the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is carried out in the framework of the GALE project for recognition of Mandarin broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features such as pitch and short-term critical band energy. Results are consistent with previous findings on a different LVCSR task, suggesting that the proposed technique is effective and robust across several conditions. Furthermore, we describe integration into the RWTH GALE LVCSR system trained on 1600 hours of Mandarin data and present progress across the GALE 2007 and GALE 2008 RWTH systems, resulting in approximately 20% CER reduction on several data sets.
Similar resources
An Efficient Hierarchical Modulation based Orthogonal Frequency Division Multiplexing Transmission Scheme for Digital Video Broadcasting
Due to the increasing number of users, efficient usage of spectrum plays an important role in digital terrestrial television networks. In digital video broadcasting, local and global content are transmitted by single-frequency and multifrequency networks, respectively. Multifrequency networks support transmission of global content and consume a large amount of spectrum. Similarly, local content are well ...
Recent improvements of the RWTH GALE Mandarin LVCSR system
This paper describes the current improvements of the RWTH Mandarin LVCSR system. We introduce a new reduced toneme set developed at RWTH. We are using different toneme sets and pronunciation lexica. For the purpose of discriminative training we will show a fast way to transform word lattices between systems using different toneme sets and pronunciation lexica. In addition to various acoustic fr...
Development of the GALE 2008 Mandarin LVCSR system
This paper describes the current improvements of the RWTH Mandarin LVCSR system. We introduce vocal tract length normalization for the Gammatone features and present comparable results for Gammatone based feature extraction and classical feature extraction. In order to benefit from the huge amount of data of 1600h available in the GALE project we have trained the acoustic models up to 8M Gaussi...
An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...
Tonal articulatory feature for Mandarin and its application to conversational LVCSR
This paper presents our recent work on the development of a tonal Articulatory Feature (AF) for Mandarin and its application to conversational LVCSR. Motivated by the theory of Mandarin phonology, eight features for classifying the acoustic units and one feature for classifying the tone are investigated and constructed in the paper, and the AF-based tandem approach is used to improve speech rec...